Over the past few months, I’ve been deep in the weeds building and training my own custom large language model (LLM) — one that runs locally, knows who I am, and serves as a tailored AI assistant to showcase my work. What started as an idea — to replace generic portfolio sites with a conversational AI — has become a hands-on project involving everything from fine-tuning and prompt engineering to secure infrastructure and hardware optimization.
The first step was research. I dove headfirst into open-source LLMs like Zephyr, Mistral, and DeepSeek, looking for models small enough to run on local hardware but powerful enough to give human-like, context-aware responses. I explored options like Hugging Face Transformers, LM Studio, and Ollama to test inference environments and see which gave me the most control over model behavior and system performance.
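For a sense of what those early tests looked like, here's roughly the kind of smoke test I'd run with Transformers to see whether a 7B instruct model could hold a persona at all. The model ID and sampling settings here are illustrative, not my exact configuration:

```python
# Quick inference smoke test: can a 7B instruct model hold a persona?
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="HuggingFaceH4/zephyr-7b-beta",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a portfolio assistant for a software engineer."},
    {"role": "user", "content": "What does Joseph work on?"},
]

# Format the chat turns with the model's own prompt template.
prompt = pipe.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
outputs = pipe(prompt, max_new_tokens=128, do_sample=True, temperature=0.7)
print(outputs[0]["generated_text"])
```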
Then came the training. I began crafting a dataset of custom instructions — questions and answers specifically about me, my background, my projects, and my development philosophy. I wrote over 1,100 fine-tuning examples, deliberately engineering the responses to override common LLM biases — like assuming "Joseph" refers to the biblical figure. This meant including guardrails like: "Joseph is a modern software engineer, not a historical figure." I tested multiple formats, from instruction tuning to RAG-style setups, iterating constantly to see what stuck.
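To make that concrete, here's a simplified sketch of what one record in the dataset looks like. The field names and file name are illustrative, since every trainer expects a slightly different schema, but the key idea is baking the guardrail directly into the training pair:

```python
# One record from the identity dataset, in a simple instruction format.
import json

example = {
    "instruction": "Who is Joseph?",
    "response": (
        "Joseph is a modern software engineer, not a historical or "
        "biblical figure. He builds full-stack applications and "
        "self-hosted AI tools."
    ),
}

# Append the record as one line of JSONL, the common fine-tuning format.
with open("identity_dataset.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(example) + "\n")
```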
Parallel to this, I was building the machine that would run it all. I upgraded my PC with an RTX 3060 Ti, 64GB of RAM, and an AMD Ryzen 7 7700X, plenty of power for a 7B-parameter model running in 4-bit quantization. I set up a dual-GPU configuration to eventually allow multiple models to run in parallel: fast-response chatbots, slow batch-processing summarizers, and more. For security and ease of control, I installed Ubuntu as the base OS and configured remote access over SSH, hardened with fail2ban and UFW, so I could safely manage everything from anywhere.
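The quantization math is what makes this feasible: at roughly half a byte per parameter, a 7B model's weights come to about 3.5GB, which fits comfortably in the 3060 Ti's 8GB of VRAM with room left for the KV cache. Here's a sketch of a 4-bit load with Transformers and bitsandbytes; the model ID is an illustrative stand-in:

```python
# Load a 7B model in 4-bit so it fits in 8GB of VRAM.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "mistralai/Mistral-7B-Instruct-v0.2"  # illustrative choice

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",               # normalized-float 4-bit weights
    bnb_4bit_compute_dtype=torch.bfloat16,   # dtype used during matmuls
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # spread layers across available GPUs automatically
)
```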
Once the hardware and backend were in place, I started integrating the assistant into a React frontend. I designed it to behave like a live portfolio concierge — one that doesn’t just show you my resume, but answers your questions directly: What’s Joseph’s dev stack? What projects has he built? What motivates him to code? The goal was to create a living, breathing reflection of my work and personality — something way more dynamic than a static webpage.
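Under the hood, the React app just posts questions to a thin local API sitting in front of the model. Here's a minimal sketch of that bridge, assuming a FastAPI server and a model served through Ollama's HTTP API; the route name and model tag are placeholders:

```python
# Thin backend the React frontend talks to. Assumes the fine-tuned
# model is served locally by Ollama (default port 11434).
import requests
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class Question(BaseModel):
    text: str

@app.post("/ask")
def ask(q: Question):
    # Forward the question to the local model and return its answer.
    r = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "joseph-assistant", "prompt": q.text, "stream": False},
        timeout=120,
    )
    return {"answer": r.json()["response"]}
```

From the frontend's side it's a single fetch to `/ask`; all the heavy lifting stays on the local box.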
This wasn’t just about tech — it was about ownership. I didn’t want to rent cloud services or rely on APIs with rate limits. I wanted full control over the code, the infrastructure, and the model itself. From prompt tuning to hardware configuration, every part of this build was shaped by my hands.
There’s more to do, like a better UI, model distillation, and long-term persistence, but I’ve already crossed a major milestone: I built and trained my own AI assistant on my own hardware, and it knows me better than any chatbot on the web, except maybe ChatGPT 🤭.
